Improve performance of scanning source files #15270

RobinMalfait · 2024-12-02T11:14:07Z

This PR improves scanning files by scanning chunks of the files in parallel. Each chunk is separated by new lines since we can't use whitespace in classes anyway.

This also means that we can use the power of your CPU to scan files faster. The extractor itself also has less state to worry about on these smaller chunks.

On a dedicated benchmark machine: Mac Mini, M1, 16 GB RAM

❯ hyperfine --warmup 15 --runs 50 \
  -n NEW 'bun --bun /Users/ben/github.com/tailwindlabs/tailwindcss/packages/@tailwindcss-cli/src/index.ts -i ./tailwind.css -o out.css' \
  -n CURRENT 'bun --bun /Users/ben/github.com/tailwindlabs/tailwindcss--next/packages/@tailwindcss-cli/src/index.ts -i ./tailwind.css -o out.css'
Benchmark 1: NEW
  Time (mean ± σ):     337.2 ms ±   2.9 ms    [User: 1376.6 ms, System: 80.9 ms]
  Range (min … max):   331.0 ms … 345.3 ms    50 runs

Benchmark 2: CURRENT
  Time (mean ± σ):     730.3 ms ±   3.8 ms    [User: 978.9 ms, System: 78.7 ms]
  Range (min … max):   722.0 ms … 741.8 ms    50 runs

Summary
  NEW ran
    2.17 ± 0.02 times faster than CURRENT

On a more powerful machine, MacBook Pro M1 Max, 64 GB RAM, the results look even more promising:

❯ hyperfine --warmup 15 --runs 50 \
  -n NEW 'bun --bun /Users/robin/github.com/tailwindlabs/tailwindcss/packages/@tailwindcss-cli/src/index.ts -i ./tailwind.css -o out.css' \
  -n CURRENT 'bun --bun /Users/robin/github.com/tailwindlabs/tailwindcss--next/packages/@tailwindcss-cli/src/index.ts -i ./tailwind.css -o out.css'
Benchmark 1: NEW
  Time (mean ± σ):     307.8 ms ±  24.5 ms    [User: 1124.8 ms, System: 187.9 ms]
  Range (min … max):   291.7 ms … 397.9 ms    50 runs

Benchmark 2: CURRENT
  Time (mean ± σ):     754.7 ms ±  27.2 ms    [User: 934.9 ms, System: 217.6 ms]
  Range (min … max):   735.5 ms … 845.6 ms    50 runs

  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

Summary
  NEW ran
    2.45 ± 0.21 times faster than CURRENT

Note: This last benchmark is running on my main machine which is more "busy" compared to my benchmark machine. Because of this I had to increase the --runs to get statistically better results. There is still a warning present, but the overall numbers are still very promising.

These benchmarks are running on our Tailwind UI project where we have >1000 files, and >750 000 lines of code in those files.

Before	After

I am sure there is more we can do here, because reading all of these 1000 files only takes ~10ms, whereas parsing all these files takes ~180ms. But I'm still happy with these results as an incremental improvement.

For good measure, I also wanted to make sure that we didn't regress on smaller projects. Running this on Catalyst, we only have to deal with ~100 files and ~18 000 lines of code. In this case reading all the files takes ~890µs and parsing takes about ~4ms.

Before	After

Not a huge difference, still better and definitely no regressions which sounds like a win to me.

Edit: after talking to @thecrypticace, instead of splitting on any whitespace we just split on newlines. This makes the chunks a bit larger, but it reduces the overhead of the extractor itself. This now results in a 2.45x speedup in Tailwind UI compared to 1.94x speedup.

All the files are already covered when running the initial build. This is for subsequent builds (in dev/watch mode). This also prevents to do some work twice and just do it once.

The `part_sort` isn't a super big difference, however it goes from ~3ms to ~1ms to do the sorting.

We already know that we can't have whitespace in our candidates which means that we can operate on many many small chunks instead. This change means that we run a lot of extractors on very small pieces of code all in parallel, instead of running the extractor per file. In my testing, this results in 2x speedup (~800ms -> ~400ms on a big project with over a 1000 files).

To reduce overhead of the Extractor itself, we can chunk the work by lines instead of every whitespace-separated chunk. This seems to improve the overall cost even more! Co-authored-by: Jordan Pittman <jordan@cryptica.me>

crates/oxide/src/lib.rs

CHANGELOG.md

RobinMalfait changed the title ~~Improve scanning files in parallel~~ Improve performance of scanning source files Dec 2, 2024

RobinMalfait marked this pull request as ready for review December 2, 2024 11:17

RobinMalfait requested a review from a team as a code owner December 2, 2024 11:17

RobinMalfait and others added 5 commits December 2, 2024 17:44

only check for new files in subsequent builds

9d3da40

All the files are already covered when running the initial build. This is for subsequent builds (in dev/watch mode). This also prevents to do some work twice and just do it once.

use par_sort and into_part_iter

5c4a4ff

The `part_sort` isn't a super big difference, however it goes from ~3ms to ~1ms to do the sorting.

update changelog

e99f276

only split by newlines

8fe3977

To reduce overhead of the Extractor itself, we can chunk the work by lines instead of every whitespace-separated chunk. This seems to improve the overall cost even more! Co-authored-by: Jordan Pittman <jordan@cryptica.me>

RobinMalfait force-pushed the feat/scanner-performance branch from d273a39 to 8fe3977 Compare December 2, 2024 16:44

only split on newlines

abd6c8f

philipp-spiess reviewed Dec 2, 2024

View reviewed changes

crates/oxide/src/lib.rs Outdated Show resolved Hide resolved

RobinMalfait force-pushed the feat/scanner-performance branch from 35f40ca to abd6c8f Compare December 2, 2024 17:01

adamwathan reviewed Dec 2, 2024

View reviewed changes

CHANGELOG.md Outdated Show resolved Hide resolved

Update CHANGELOG.md

cb8a7d7

thecrypticace approved these changes Dec 2, 2024

View reviewed changes

adamwathan approved these changes Dec 2, 2024

View reviewed changes

adamwathan merged commit 6af4835 into next Dec 2, 2024
1 check passed

adamwathan deleted the feat/scanner-performance branch December 2, 2024 19:03

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve performance of scanning source files #15270

Improve performance of scanning source files #15270

RobinMalfait commented Dec 2, 2024 •

edited

Loading

Improve performance of scanning source files #15270

Improve performance of scanning source files #15270

Conversation

RobinMalfait commented Dec 2, 2024 • edited Loading

RobinMalfait commented Dec 2, 2024 •

edited

Loading